1      Introduction 

Let $A_1, \ldots, A_n$ be events in a probability space. The basic probabilistic method says:

If $\sum_i \Pr[A_i] < 1$, then $\Pr[\bigwedge_i \bar{A}_i] > 0$.

Explained more basically, this method proves the existence of a configuration, first by creating a probability space whose points are configurations and second, by showing a positive probability that the "random" configuration meets the desired criteria. So the $A_i$ are typically "bad" events, and the probabilistic method guarantees the existence of a "good" configuration. The method was first introduced by Paul Erdos in proving lower bounds for Ramsey numbers. Since then it has been a useful tool in graph theory and combinatorics. Spencer [14] provides, with an algorithmic flavor, an excellent treatment of different techniques similar in spirit to the basic probabilistic method.

In this paper, we discuss algorithmic aspects of some of the (basic and non-basic) probabilistic methods. First we describe the power of the basic method by showing instances wherein the probabilistic proof can be converted to an efficient deterministic construction. We survey, in addition, two techniques useful when the basic method fails; i.e. when the "random configuration" fails to meet the desired criteria. The first is a collection of methods with an element of linear algebra. These methods are very constructive in nature; i.e. whenever we can prove the existence of solutions, it is also the case that there is an "efficient" way of computing the solution. As we shall see, these provide useful approximation algorithms for some "hard" problems known to be NP-Complete. The situation, however, is quite different with the second technique, the well known Lovasz Local Lemma. The local lemma is used to prove various combinatorial results for which no other proof is known. Moreover, there are no known efficient algorithms to construct any of the configurations whose existence it guarantees, and the problem of finding more effective versions of these proofs remains an open and intriguing challenge.


2      The  Probabilistic  Method 

Consider tournaments of $n$ players in which every pair of players plays a game and there are no draws. By directing an edge from $i$ to $j$ when player $i$ beats player $j$, we can represent a tournament as a complete directed graph on $n$ vertices. A tournament $T_n$ has property $S_k$ if for every $k$ players $x_1, \ldots, x_k$ there is some other player $y$ who beats them all. The basic probabilistic method shows

Theorem 2.1   For every $k$ there is a finite $T_n$ with property $S_k$.

Proof. Consider a random $T_n$, i.e. every game is determined by the flip of a fair coin. For a set $X$ of $k$ players let $A_X$ be the (bad) event that no $y \notin X$ beats all of $X$. Each $y \notin X$ has probability $2^{-k}$ of beating all of $X$ and there are $n - k$ such $y$, all of whose chances are mutually independent, so

$$\Pr[A_X] = (1 - 2^{-k})^{n-k}.$$

Hence

$$\Pr[\mathrm{BAD}] = \Pr\Big[\bigvee_X A_X\Big] \le \binom{n}{k} (1 - 2^{-k})^{n-k}.$$

We choose $n$ so that

$$\binom{n}{k} (1 - 2^{-k})^{n-k} < 1.$$

For this $n$, $\bigwedge_X \bar{A}_X$ has positive probability so there is a GOOD point in the probability space, i.e., a tournament $T$ with property $S_k$. The above condition is roughly

$$n^k e^{-n 2^{-k}} < 1$$

so we need

$$n > 2^k k^2 (\ln 2)(1 + o(1)).$$
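As a small numerical illustration of the proof above, the following sketch searches for the first $n$ at which the union bound on Pr[BAD] falls below 1. The function names are illustrative choices, not part of the original text, and exact rational arithmetic is used only to avoid rounding questions.

```python
from fractions import Fraction
from math import comb


def bad_probability_bound(n: int, k: int) -> Fraction:
    """Union bound C(n,k) * (1 - 2^-k)^(n-k) on Pr[BAD] for a random tournament."""
    return comb(n, k) * Fraction(2**k - 1, 2**k) ** (n - k)


def smallest_n(k: int) -> int:
    """First n > k at which the union bound drops below 1."""
    n = k + 1
    while bad_probability_bound(n, k) >= 1:
        n += 1
    return n


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, smallest_n(k))
```

The values returned are only what this particular union bound certifies; tournaments with property $S_k$ may well exist on fewer players.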

The derivation of lower bounds for Ramsey numbers was one of the first applications of the probabilistic method. Let $R(k,t)$ denote the Ramsey function, i.e. the minimal $n$ so that if the edges of $K_n$ are two-colored (Red and Blue), then either there is a Red $K_k$ or a Blue $K_t$. To show $R(k,t) > n$ we must prove the existence of a coloring of $K_n$ with neither a Red $K_k$ nor a Blue $K_t$.

Theorem 2.2  If

$$\binom{n}{k} p^{\binom{k}{2}} + \binom{n}{t} (1-p)^{\binom{t}{2}} < 1$$

for some $p \in [0, 1]$, then $R(k, t) > n$.

Proof. Color $K_n$ randomly with $\Pr[\chi(i,j) = \mathrm{Red}] = p$. For each $k$-set $S$ let $A_S$ be the (bad) event that $S$ is Red, and for each $t$-set $T$ let $B_T$ be the (bad) event that $T$ is Blue. Then

$$\Pr[A_S] = p^{\binom{k}{2}}, \qquad \Pr[B_T] = (1-p)^{\binom{t}{2}}$$

so Pr[BAD] is

$$\Pr\Big[\bigvee_S A_S \vee \bigvee_T B_T\Big] \le \sum_S \Pr[A_S] + \sum_T \Pr[B_T] = \binom{n}{k} p^{\binom{k}{2}} + \binom{n}{t} (1-p)^{\binom{t}{2}} < 1.$$

With positive probability, $\chi$ has neither a Red $K_k$ nor a Blue $K_t$. Thus a (GOOD) coloring exists!
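The condition of Theorem 2.2 is easy to evaluate exactly for small parameters. The sketch below is an illustration only; the names `union_bound` and `certifies_lower_bound` are hypothetical and do not appear in the original text.

```python
from fractions import Fraction
from math import comb


def union_bound(n: int, k: int, t: int, p: Fraction) -> Fraction:
    """C(n,k) p^C(k,2) + C(n,t) (1-p)^C(t,2), the union bound on Pr[BAD]."""
    return comb(n, k) * p ** comb(k, 2) + comb(n, t) * (1 - p) ** comb(t, 2)


def certifies_lower_bound(n: int, k: int, t: int, p: Fraction) -> bool:
    """True if Theorem 2.2 yields R(k,t) > n for this choice of p."""
    return union_bound(n, k, t, p) < 1


if __name__ == "__main__":
    half = Fraction(1, 2)
    # With p = 1/2 the bound certifies R(4,4) > 6 but says nothing about n = 7.
    print(certifies_lower_bound(6, 4, 4, half), certifies_lower_bound(7, 4, 4, half))
```

A failed check does not mean that $R(k,t) \le n$; it only means that this particular $p$ gives no certificate.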

Let us calculate the specific case of $R(4,t)$. We want $\binom{n}{4} p^6 = c n^4 p^6 < 1$ so we take $p = \epsilon n^{-2/3}$. Now we estimate $\binom{n}{t}$ by $n^t$ and $1 - p$ by $e^{-p}$, so we want $n^t e^{-p t^2/2} < 1$. That is, $t > (2/p) \ln n = K n^{2/3} \ln n$. Expressing $n$ in terms of $t$,

$$R(4,t) > c\, t^{3/2} / \ln^{3/2} t = t^{3/2 + o(1)}.$$

We shall see shortly that this can be improved.

The  deletion  method.  This  method  typically  finds  a  GOOD  configuration  (tournament 
or  coloring)  by  taking  a  random  configuration  and  making  a  "small"  modification.  Let  us 
see  how  it  works  on  our  off-diagonal  Ramsey  numbers. 


Theorem 2.3

$$R(k,t) > n - \binom{n}{k} p^{\binom{k}{2}} - \binom{n}{t} (1-p)^{\binom{t}{2}}$$

for any $n$ and $p$ with $0 < p < 1$.


Proof. Let $X$ be the number of monochromatic (UGLY!) sets, i.e. Red $k$-sets and Blue $t$-sets, when we color $K_n$ randomly. Further, let $X_S$ be the indicator random variable of the event $A_S$ that $S$ is monochromatic. By linearity of expectation

$$E(X) = \sum_S E(X_S) = \sum_S \Pr[A_S] = \binom{n}{k} p^{\binom{k}{2}} + \binom{n}{t} (1-p)^{\binom{t}{2}}.$$

There is a point in the probability space for which $X$ does not exceed its expectation. That is, there is a coloring with at most

$$\binom{n}{k} p^{\binom{k}{2}} + \binom{n}{t} (1-p)^{\binom{t}{2}}$$

UGLY $S$. "Fix" that coloring. For each such $S$ select a point $x \in S$ arbitrarily and "delete" it from the vertex set. The remaining points have no monochromatic $K_k$ or $K_t$ and hence the theorem.
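The bound of Theorem 2.3 can likewise be evaluated exactly. The following sketch is a hedged illustration with hypothetical function names; it simply maximizes the right-hand side over a small range of $n$ for a fixed $p$.

```python
from fractions import Fraction
from math import comb


def deletion_lower_bound(n: int, k: int, t: int, p: Fraction) -> Fraction:
    """n minus the expected number of UGLY sets; R(k,t) exceeds this value."""
    expected_ugly = comb(n, k) * p ** comb(k, 2) + comb(n, t) * (1 - p) ** comb(t, 2)
    return n - expected_ugly


def best_deletion_bound(k: int, t: int, p: Fraction, n_max: int) -> Fraction:
    """Best value of the deletion bound over max(k,t) <= n <= n_max."""
    return max(deletion_lower_bound(n, k, t, p) for n in range(max(k, t), n_max + 1))


if __name__ == "__main__":
    print(best_deletion_bound(4, 4, Fraction(1, 2), 12))
```

For tiny parameters the deletion bound need not beat the basic method; its advantage is asymptotic, as the calculation for $R(4,t)$ indicates.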

The asymptotics show improved results. For example, if we select $p = \epsilon n^{-1/2}$ we get

$$R(4,t) > c\, t^2 / \ln^2 t = t^{2 + o(1)}.$$

We shall see further improvement in this bound in Section 4 using the local lemma.

Lovasz Local Lemma. We know by basic probability that if there are $n$ mutually independent (bad) events, each with probability $< 1$, then the probability that none of the bad events happens is $> 0$. The local lemma guarantees the same in spite of some dependence among the events. Besides being very useful, the beauty of the lemma is that it requires only basic probability to prove. We denote the dependence among events $A_1, \ldots, A_n$ by a dependency graph $G$ on vertices $1, \ldots, n$, where for every $i$ the event $A_i$ is mutually independent of all the $A_j$ with $\{i,j\} \notin G$.

Theorem 2.4 (Symmetric Case). Let $A_1, \ldots, A_n$ be events with dependency graph $G$ such that

$\Pr[A_i] \le p$ for all $i$, $\deg(i) \le d$ for all $i$ and $4dp \le 1$,

then $\Pr[\bigwedge_i \bar{A}_i] > 0$.

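Proof. We prove by induction on $s$ that if $|S| \le s$, then for any $i$

$$\Pr\Big[A_i \,\Big|\, \bigwedge_{j \in S} \bar{A}_j\Big] \le 2p.$$

For $S = \emptyset$ this is immediate. Renumbering for convenience so that $i = n$, $S = \{1, \ldots, s\}$ and $\{i, x\} \notin G$ for $x > d$, we get

$$\Pr[A_n \mid \bar{A}_1 \cdots \bar{A}_s] = \frac{\Pr[A_n \bar{A}_1 \cdots \bar{A}_d \mid \bar{A}_{d+1} \cdots \bar{A}_s]}{\Pr[\bar{A}_1 \cdots \bar{A}_d \mid \bar{A}_{d+1} \cdots \bar{A}_s]}.$$

The numerator is $\le \Pr[A_n \mid \bar{A}_{d+1} \cdots \bar{A}_s] = \Pr[A_n] \le p$. We can bound the denominator:

$$\Pr[\bar{A}_1 \cdots \bar{A}_d \mid \bar{A}_{d+1} \cdots \bar{A}_s] \ge 1 - \sum_{i=1}^{d} \Pr[A_i \mid \bar{A}_{d+1} \cdots \bar{A}_s] \ge 1 - \sum_{i=1}^{d} 2p \ \text{(induction)} = 1 - 2pd \ge \frac{1}{2}.$$

Hence we have the quotient $\le p / \frac{1}{2} = 2p$, completing the induction. Thus

$$\Pr[\bar{A}_1 \cdots \bar{A}_n] = \prod_{i=1}^{n} \Pr[\bar{A}_i \mid \bar{A}_1 \cdots \bar{A}_{i-1}] \ge \prod_{i=1}^{n} (1 - 2p) > 0.$$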

Note.  The  best  possible  constant  (in  place  of  "4")  in  the  lemma  turns  out  to  be  e.  There 
is  a  more  general  version  of  the  lemma  which  we  shall  see  in  Section  4. 


3      Constructive  Methods 

In this section we discuss two issues. The first concerns instances in which probabilistic existential proofs can be adapted to provide deterministic constructive algorithms. Spencer [14] and Raghavan [11] discuss such a methodology using an interesting "method of conditional probabilities". The second issue illustrates instances when the probabilistic method "fails", but linear algebra yields significant results.

3.1      Determinism  via  Randomization 

Consider the following lattice approximation problem. We are given an $n \times r$ matrix $C$ in which $c_{ij} \in [0, 1]$ for all $i, j$, and an $r$-vector $p = (p_1, \ldots, p_r)$ where each $p_j$ is a real number in $[0, 1]$. We wish to find a lattice point $q = (q_1, \ldots, q_r)$ which is "close" to $p$ and which gives a bound on the discrepancies

$$d_i = \Big| \sum_{j=1}^{r} c_{ij} (q_j - p_j) \Big|$$

in terms of the inner products $s_i = \sum_{j=1}^{r} c_{ij} p_j$. The point $p$ may be thought of as the solution to the relaxation linear program; we wish to find a feasible lattice point that is "nearby." Spencer [13] proved that there always exists a lattice point so that $d_i \le 6\sqrt{n}$ for all $i$. However the proof is not known to be constructive. Spencer and Raghavan obtain constructive (albeit weaker) bounds using "randomized rounding": set each $q_j$ to 1 with probability $p_j$, independently of all the other $q_j$; i.e. each $q_j$ is a Bernoulli trial with $E[q_j] = p_j$. Consider the random variable $y_i = \sum_{j=1}^{r} c_{ij} q_j$. Note that

$$d_i = |y_i - s_i| \quad \text{and} \quad E[y_i] = s_i.$$

They  prove  the  following  general  theorem. 

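Theorem 3.1 Let $a_1, \ldots, a_r$ be reals in $(0, 1]$. Let $X_1, \ldots, X_r$ be independent Bernoulli trials with $E[X_j] = p_j$. If $y = \sum_{j=1}^{r} a_j X_j$ is the random variable with $E[y] = \sum_{j=1}^{r} a_j p_j = m$, then for $\delta > 0$

$$\Pr[\,|y - m| > \delta m\,] < \left[ \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right]^{m}.$$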

In other words, if $B(m, \delta)$ denotes the bound on the probability that the weighted sum of Bernoulli trials with expectation $m$ exceeds $(1 + \delta)m$, for positive $\delta$:

$$B(m, \delta) = \left[ e^{\delta} / (1+\delta)^{1+\delta} \right]^{m}.$$

And if we denote by $D(m, x)$ the deviation that results in the bound on the tail probability being $x$:

$$B(m, D(m, x)) = x.$$
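Since $B(m, \delta)$ is strictly decreasing in $\delta$ for $\delta > 0$, the deviation $D(m, x)$ can be computed numerically. The sketch below uses plain bisection; the function names and the iteration count are illustrative assumptions, not taken from Spencer or Raghavan.

```python
import math


def tail_bound(m: float, delta: float) -> float:
    """B(m, delta) = [e^delta / (1 + delta)^(1 + delta)]^m for delta > 0."""
    return math.exp(m * (delta - (1.0 + delta) * math.log1p(delta)))


def deviation(m: float, x: float) -> float:
    """D(m, x): the delta > 0 with B(m, delta) = x, assuming 0 < x < 1."""
    lo, hi = 0.0, 1.0
    while tail_bound(m, hi) > x:  # grow the bracket until the bound is below x
        hi *= 2.0
    for _ in range(100):  # bisection on the decreasing function delta -> B(m, delta)
        mid = (lo + hi) / 2.0
        if tail_bound(m, mid) > x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    n = 50
    print(deviation(10.0, 1.0 / (2 * n)))
```

With $x = 1/2n$ this is the quantity that appears in the existence proof for the lattice approximation problem.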

We  are  now  in  a  position  to  describe  the  existence  proof  for  the  lattice  approximation 
problem. 


Theorem 3.2    There exists an integer approximation vector $q$ such that

$$d_i \le s_i D(s_i, 1/2n).$$

Proof. Select the integers $q_j$ using randomized rounding. Let $B_i$ be the bad event that $d_i = |y_i - s_i|$ exceeds the bound in the theorem. Then by the definitions above,

$$\Pr[y_i > s_i + s_i D(s_i, 1/2n)] < 1/2n$$

$$\Pr[y_i < s_i - s_i D(s_i, 1/2n)] < 1/2n$$

i.e. $\Pr[B_i] < 1/n$. Since there are $n$ possible bad events, the probability that the vector produced by randomized rounding is BAD is $< n(1/n) = 1$. Thus there exists a GOOD $q$.

Method of Conditional Probabilities. We now outline the deterministic algorithm to compute $q = (q_1, \ldots, q_r)$. Imagine the computation modeled as a decision tree $T$ (a complete binary tree) of $r$ levels. Assigning the variables $q_1, q_2, \ldots$ in sequence to 1 or 0 corresponds to taking a left or right branch, walking down $T$ from the root to a leaf. Each leaf is "good" or "bad" depending on the $q$-vector it corresponds to. Randomized rounding is equivalent to taking the left son at level $j$ with probability $p_j$, and the right son with probability $1 - p_j$. We will be done if we can walk down the tree to a good leaf in polynomial time. Let $P_j(q_1, \ldots, q_{j-1})$ denote the conditional probability of a bad event occurring given $q_1, \ldots, q_{j-1}$ and assuming that randomized rounding is used to compute $q_j, \ldots, q_r$. (In particular $P(\text{leaf})$ is the probability that the leaf we reached is bad, i.e. 0 or 1.) Then

$$P_j(q_1, \ldots, q_{j-1}) = p_j P_{j+1}(q_1, \ldots, q_{j-1}, 1) + (1 - p_j) P_{j+1}(q_1, \ldots, q_{j-1}, 0)$$

which implies

$$P_j(q_1, \ldots, q_{j-1}) \ge \min \{ P_{j+1}(q_1, \ldots, q_{j-1}, 1),\ P_{j+1}(q_1, \ldots, q_{j-1}, 0) \}.$$

Combining the above recurrence with the fact that there exists at least one good leaf (i.e. $P_1 < 1$), we have the following algorithm:

For $j = 1$ to $r$, at level $j$ we set $q_j$ to 0 or 1 so as to minimize $P_{j+1}$.

This minimization procedure is guaranteed to lead us to a good leaf ($P(\text{leaf}) = 0$), since each leaf is either bad or good! For completeness, we must add that Spencer and Raghavan replace $P_j(q_1, \ldots, q_{j-1})$ by a suitable upper bound function so that the conditional probabilities can be efficiently estimated. For brevity we omit the details.
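The walk down the decision tree can be sketched in a few lines. The version below is a toy illustration under a strong assumption: it computes the conditional probabilities exactly by enumerating all completions, which takes exponential time, whereas Spencer and Raghavan use an efficiently computable upper bound instead. All names and the example instance are hypothetical.

```python
from fractions import Fraction
from itertools import product


def is_bad(c, p, q, bounds):
    """True if some discrepancy d_i = |y_i - s_i| exceeds its allowed bound."""
    for row, bound in zip(c, bounds):
        s = sum(cij * pj for cij, pj in zip(row, p))
        y = sum(cij * qj for cij, qj in zip(row, q))
        if abs(y - s) > bound:
            return True
    return False


def cond_bad_prob(c, p, prefix, bounds):
    """Probability of reaching a bad leaf when the remaining q_j are rounded randomly."""
    rest = p[len(prefix):]
    total = Fraction(0)
    for tail in product((0, 1), repeat=len(rest)):
        weight = Fraction(1)
        for bit, pj in zip(tail, rest):
            weight *= pj if bit else 1 - pj
        if is_bad(c, p, list(prefix) + list(tail), bounds):
            total += weight
    return total


def derandomized_rounding(c, p, bounds):
    """Fix q_1, q_2, ... in turn, each time taking the branch with the smaller bad probability."""
    q = []
    for _ in p:
        q.append(min((0, 1), key=lambda bit: cond_bad_prob(c, p, q + [bit], bounds)))
    return q


if __name__ == "__main__":
    C = [[1, 1, 1], [1, 0, 1]]
    p = [Fraction(1, 2)] * 3
    bounds = [Fraction(1, 2), Fraction(1, 2)]
    print(derandomized_rounding(C, p, bounds))
```

Whenever the root probability is below 1, the recurrence guarantees that the vector returned is a good leaf; the `bounds` argument stands in for the quantities $s_i D(s_i, 1/2n)$.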

Application to NP-C problems. Raghavan applies this method to NP-hard integer programs arising in packing, routing, and maximum multicommodity flow, demonstrating polynomial time approximation algorithms. We refer to [11] and [8] for these and other applications.




3.2      Steinitz  constant  and  flow  shop  problem. 

In 1913 Steinitz [15] proved that for every normed $d$-dimensional space there exists a constant $B$ such that for every finite collection of vectors $x_1, \ldots, x_n$ satisfying $\sum_{i=1}^{n} x_i = 0$, $\|x_i\| \le 1$ (for all $i$) there exists a permutation $\pi$ such that for all positive integers $k \le n$ we have $\|\sum_{i=1}^{k} x_{\pi(i)}\| \le B$. The constant $B$ here depends only on the dimension $d$ and the given norm; the smallest possible value of $B$, denoted by $B_{\min}$, is called the Steinitz constant. Till recent years no polynomial (in $d$) upper estimate was known for $B_{\min}$. Grinberg and Sevast'yanov [?] showed that $B_{\min} \le d$ for any norm (not necessarily even symmetric). It is worthwhile looking at the proof of this rather striking result.

The proof makes use of the following simple-to-prove lemma.

Lemma.    Let $K$ be a polyhedron in $\mathbb{R}^n$ defined by the linear inequalities:

$$f_i(x) = a_i, \quad i = 1, \ldots, p$$

$$g_j(x) \le b_j, \quad j = 1, \ldots, q.$$

If $x_0$ is a vertex of $K$ and $A = \{j : g_j(x_0) = b_j\}$, then $|A| \ge n - p$.

Theorem 3.3 Let $\|\cdot\|$ be an arbitrary norm in $\mathbb{R}^d$ and assume that $x_1 + \cdots + x_n = x$ with $\|x_i\| \le 1$ (for all $i$). Then there exists a permutation $\pi$ such that for all positive integers $k \le n$

$$\Big\| \sum_{i=1}^{k} x_{\pi(i)} - \frac{k-d}{n} x \Big\| \le d. \qquad (1)$$


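Proof.    We construct a chain of sets $A_d \subset A_{d+1} \subset \cdots \subset A_n = \{1, \ldots, n\}$ and numbers $\lambda_k^i$ ($k = d, d+1, \ldots, n$; $i \in A_k$) with the properties:

$$|A_k| = k, \quad 0 \le \lambda_k^i \le 1, \quad \sum_{i \in A_k} \lambda_k^i = k - d, \quad \sum_{i \in A_k} \lambda_k^i x_i = \frac{k-d}{n} x.$$

Base case:

$$k = n, \quad A_n = \{1, \ldots, n\}, \quad \lambda_n^i = \frac{n-d}{n}.$$

Inductive step: $k + 1 \to k$. Consider the set $K$ of collections $\{\mu_i, i \in A_{k+1}\}$ with the properties:

$$\sum_{i \in A_{k+1}} \mu_i = k - d, \quad \sum_{i \in A_{k+1}} \mu_i x_i = \frac{k-d}{n} x, \quad 0 \le \mu_i \le 1.$$

From this set of $k + 1$ $\mu$'s we want to construct a new set of at most $k$ $\mu$'s, while making sure the partial sum of $x_i$'s is still low. Notice that $K$ is convex and compact in $\mathbb{R}^{k+1}$ and nonempty (it contains $\mu_i = \frac{k-d}{k+1-d} \lambda_{k+1}^i$). Let $\{\mu_i, i \in A_{k+1}\}$ be a vertex of $K$. By the previous lemma $|\{i : \mu_i = 0 \text{ or } 1\}| \ge (k + 1) - (d + 1) = k - d$. This means at least one $\mu_i$ is zero (or else the sum of $\mu$'s would be more than $k - d$). Let $\mu_j = 0$. We put $A_k = A_{k+1} \setminus \{j\}$, $\lambda_k^i = \mu_i$ ($i \in A_k$), completing the inductive construction. To conclude:

We put $\{\pi(i)\} = A_i \setminus A_{i-1}$ ($i = d + 1, \ldots, n$), with $\pi$ otherwise arbitrary. For $k \le d$, inequality (1) obviously holds. For $k \ge d + 1$,

$$\Big\| \sum_{i=1}^{k} x_{\pi(i)} - \frac{k-d}{n} x \Big\| = \Big\| \sum_{i \in A_k} (1 - \lambda_k^i) x_i \Big\| \le \sum_{i \in A_k} (1 - \lambda_k^i) = d.$$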
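Inequality (1) can be checked by brute force on a tiny instance. The sketch below does not implement the vertex-based construction of the proof; it simply tries every permutation, which is feasible only for a handful of vectors. The Euclidean norm, the function names and the sample vectors are illustrative assumptions.

```python
from itertools import permutations
from math import hypot


def worst_prefix(vectors, order, d):
    """Largest value of || sum_{i<=k} x_pi(i) - ((k - d)/n) x || over all k."""
    n = len(vectors)
    total = [sum(v[c] for v in vectors) for c in range(d)]
    prefix = [0.0] * d
    worst = 0.0
    for k, idx in enumerate(order, start=1):
        prefix = [a + b for a, b in zip(prefix, vectors[idx])]
        shifted = [a - (k - d) / n * x for a, x in zip(prefix, total)]
        worst = max(worst, hypot(*shifted))
    return worst


def best_order(vectors, d):
    """Exhaustive search for the permutation with the smallest worst prefix."""
    return min(permutations(range(len(vectors))),
               key=lambda order: worst_prefix(vectors, order, d))


if __name__ == "__main__":
    vs = [(1, 0), (0.5, 0.5), (-1, 0), (0, -1), (-0.5, 0.5), (0, 0)]
    order = best_order(vs, 2)
    print(order, worst_prefix(vs, order, 2))
```

By Theorem 3.3 the value found can never exceed $d$ when every vector has norm at most 1.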


In 1974 Belov and Stolin [4], using the Steinitz lemma, gave a near optimal algorithm for the NP-Complete flow shop problem. The $m$-machine $n$-job flow shop problem can be stated as follows. The shop is an ordered set of $m$ machines on which $n$ jobs are to be executed. Each job consists of $m$ non-preemptive operations, the $j$th operation of any job precedes its $(j + 1)$th operation, and further, the $j$th operation has to be carried out on the $j$th machine. Given the execution times of all operations, we must give the order of the executions of the operations on the $m$ machines so that the finish time is minimal. This problem is known [6] to be NP-Complete for $m > 2$. Belov-Stolin's algorithm finds a permutation schedule that is near optimal in the sense that its error is independent of the number of jobs. Independently of Sevast'yanov's result, and around the same time, Barany [2] gave an algorithm proving $B_{\min} \le \frac{3}{2} d$ with complexity $O(n^2 d^3 + n d^4)$, which in turn yielded an approximation algorithm of complexity $O(n^2 m^3 + n m^4)$ for the flow shop problem. Thus Barany's algorithm runs in time polynomial both in $m$ and $n$, whereas the previous algorithms were exponential in $m$. We present an outline of the algorithm here, while omitting the details and the translation between the Steinitz lemma and the flow shop problem.

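Theorem 3.4 For a finite set $V = \{v_1, \ldots, v_n\} \subseteq \mathbb{R}^d$ with

$$\sum_{i=1}^{n} v_i = 0 \quad \text{and} \quad \|v_i\| \le 1 \ (i = 1, \ldots, n),$$

we find a permutation $i_1, \ldots, i_n$, in time $O(n^2 d^3 + n d^4)$, such that for all positive $k \le n$

$$\Big\| \sum_{j=1}^{k} v_{i_j} \Big\| \le \frac{3}{2} d.$$
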
Proof. The proof is by construction of the desired permutation. Call a map $\gamma : V \to \mathbb{R}$ a linear dependence (for $V$) if $\sum_{v \in V} \gamma(v) v = 0$. Further let $A = \{v \in V : \gamma(v) = 1\}$, $C = \{v \in V : \gamma(v) = 0\}$, $B = \{v \in V : 0 < \gamma(v) < 1\}$. Then the idea is to construct a finite sequence of linear dependences $\gamma_0, \gamma_1, \ldots, \gamma_k$ satisfying the conditions:

(a) $0 \le \gamma_i(v) \le 1$ for all $v$,

(b) $|B_i| \le d$,

(c) $A_{i+1} \supseteq A_i$, $A_k = V$,

(d) $|B_{i+1} \cup (A_{i+1} \setminus A_i)| \le 2d$.


Conditions (a) and (b) together guarantee that the sum of the vectors from $A_i$ is near to zero. Condition (d) makes sure that the progress is not too rapid, so the sum of all vectors from $A_i$ and a few vectors from $A_{i+1} \setminus A_i$ is also near to zero. The algorithm starts with the trivial linear dependence $\gamma_0 = 0$, and terminates with $\gamma_k = 1$. The construction of the sequence is inductive, and makes repeated use of the following lemma which is a modification of the well-known Caratheodory's theorem.

Lemma. Let $V \subset \mathbb{R}^d$, $|V| = n$ and let $\lambda$ be a nontrivial linear dependence over $V$ and $v^*$ be an arbitrary vector from $D = \{v : \lambda(v) > 0\}$. Then we can find in $O(n d^3)$ steps another nontrivial linear dependence $\alpha$ such that for $E = \{v : \alpha(v) > 0\}$

$$v^* \in E \quad \text{and} \quad |E| \le d + 1.$$

Proof. If $|D| \le d + 1$, then we are done. If not, then let $G \subset D \setminus \{v^*\}$ be an arbitrary set with $d + 1$ elements. Then we find a nontrivial solution of the following linear system of $d$ equations in $d + 1$ variables: $\sum_{v \in G} \mu(v) v = 0$. Note that $\mu$ is a linear dependence, since we put $\mu(v) = 0$ for $v \notin G$. We choose an appropriate $t_0$ so that $\alpha = \lambda + t_0 \mu$ is a linear dependence and maps at least one $v \in D$ to zero.

Clearly, we can replace $\lambda$ by $\alpha$ and repeat the whole procedure till $|E| \le d + 1$. Since we make progress each time, we stop in at most $n$ iterations. Thus we find $\alpha$ in time $O(n d^3)$.

Remark.  We  shall  see  in  the  following  subsection  that  the  above  technique  is  useful  in 
rounding  certain  real  variables  to  ±1  . 

We  refer  to  Barany's  paper  for  the  proof  of  the  following  theorem. 

Theorem  3.5    There  is  a  permutation  schedule  for  which  the  finish  time  T  satisfies 

'3m  —  1' 


M    <    T    <    M  +  {m-l) 
And the schedule can be given in $O(n^2 m^3 + n m^4)$ steps.

3.3      Balancing  Matrices  and  switching  lights  in  Bell  Labs. 

The game we are about to describe is attributed to David Gale and Elwyn Berlekamp. It was designed and built by Berlekamp and was supposedly a fixture in the tea room of the mathematics department at Bell Labs. The game consists of an $n \times n$ array of lights and $2n$ switches, one for each row and column. Each switch when thrown changes each light in its line from off to on or from on to off. The signed discrepancy is defined as the number of lights on minus the number of lights off. The object is to minimize the discrepancy, which is the absolute value of the signed discrepancy. When $n$ is odd the discrepancy is always at least 1, and when $n$ is even the signed discrepancy remains in the same residue class modulo 4 under line shifts. In 1970, J. Komlos and M. Sulyok [9] proved that a player can always achieve this minimal discrepancy. J. Beck and J. Spencer gave a different proof, which is algorithmic in the sense that one could find the necessary row and column shifts in polynomial time. Their algorithm uses a technique of linear algebra which we shall call "iterative round-off".

Mathematically speaking, a configuration of the game is denoted by a matrix $A = (a_{ij})$ where $a_{ij} = \pm 1$ denotes the corresponding light being on (or off). The column and row shifts are denoted by $x_j$ and $y_i$, which take values $\pm 1$. Beck and Spencer's result can now be stated as

Theorem 3.6 Given any $n \times n$ matrix $A = (a_{ij})$ with all $a_{ij} = \pm 1$, we can find (in time $O(n^4)$) $x_1, \ldots, x_n, y_1, \ldots, y_n = \pm 1$ so that, setting

$$D = \Big| \sum_{i=1}^{n} \sum_{j=1}^{n} y_i x_j a_{ij} \Big|,$$

$D = 1$ when $n$ is odd, and $D \le 2$ when $n$ is even.

Proof. (Outline). Let $r_i$ denote the $i$-th row sum of a configuration. The idea is to use column shifts till the $r_i$ have an "appropriate" form and then apply only row shifts at the last step. Note that the line shifts are commutative and of order 2. The following lemma describes the appropriate form of the $r_i$.

Lemma. Given any initial configuration there exist column shifts $x_j$ so that the new row sums satisfy

$$|r_i| < 2i, \quad 1 \le i \le n.$$

Proof. We have $r_i = \sum_{j=1}^{n} a_{ij} x_j$, $1 \le i \le n$. Consider the $r_i$ as linear forms on variables $x_1, \ldots, x_n$, with $a_{ij}$ fixed. Starting with all $x_j$ equal to zero, we want to iteratively round each $x_j$ to $\pm 1$. Let us call a variable $x_j$ fixed if $x_j = \pm 1$ and floating if $-1 < x_j < +1$. Once a variable is fixed it does not change. At a typical stage in the iteration suppose there are exactly $i$ floating variables (say $x_1, \ldots, x_i$) and that the first $i$ row sums all equal zero. Dropping the $i$-th row, consider the system of $i - 1$ linear equations $r_t = 0$, $1 \le t \le i - 1$, in $i$ variables $x_1, \ldots, x_i$ (treating the fixed $x_j$ as constants). We now use the same trick as was used in Barany's lemma. We find a line of solutions

$$(x_1', \ldots, x_i') = (x_1, \ldots, x_i) + \lambda (c_1, \ldots, c_i)$$

where $(c_1, \ldots, c_i)$ is found by Gaussian elimination on the system of $i - 1$ equations in time $O(i^3)$. Set $\lambda$ equal to that real of minimal absolute value such that some $x_t'$ becomes fixed. Replacing the old $x_j$ with the new $x_j'$, we have at most $i - 1$ floating variables, while the first $i - 1$ row sums still equal zero.

Initially we have $n$ floating variables and $n$ row sums equal to zero. Hence we apply the above procedure at most $n$ times to round $x_1, \ldots, x_n$. We can now verify that these $x_j$ satisfy the Lemma. At first with $i$ floating variables the row sum $r_i$ is still zero. After that each of these $i$ variables changes by less than two! And since $x_j$ is multiplied by $a_{ij}$, $|a_{ij}| \le 1$, the row sum changes by less than $2i$. The total time complexity is $O(n^4)$.
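The iterative round-off of the lemma can be sketched with exact rational arithmetic. This is an illustrative reading of the outline above, not Beck and Spencer's code: the helper names are hypothetical, rows are indexed from 0, and the step always moves in the positive direction along the solution line rather than choosing the direction of minimal absolute value.

```python
from fractions import Fraction


def null_vector(rows, ncols):
    """A nonzero solution c of rows * c = 0, assuming fewer rows than columns."""
    rows = [r[:] for r in rows]
    pivots = []  # pivots[r] is the pivot column of reduced row r
    rank = 0
    for col in range(ncols):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = 1 / rows[rank][col]
        rows[rank] = [v * inv for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        pivots.append(col)
        rank += 1
    free = next(c for c in range(ncols) if c not in pivots)
    vec = [Fraction(0)] * ncols
    vec[free] = Fraction(1)
    for r, pc in enumerate(pivots):
        vec[pc] = -rows[r][free]
    return vec


def column_shifts(a):
    """Round x from all zeros to +-1, keeping the first f-1 row sums at zero
    while f variables are still floating."""
    n = len(a)
    x = [Fraction(0)] * n
    floating = list(range(n))
    while floating:
        f = len(floating)
        rows = [[Fraction(a[i][j]) for j in floating] for i in range(f - 1)]
        c = null_vector(rows, f)
        # Largest step t > 0 that keeps every floating variable inside [-1, 1].
        t = min((1 - x[j]) / cj if cj > 0 else (-1 - x[j]) / cj
                for j, cj in zip(floating, c) if cj != 0)
        for j, cj in zip(floating, c):
            x[j] += t * cj
        floating = [j for j in floating if abs(x[j]) != 1]
    return [int(v) for v in x]


if __name__ == "__main__":
    A = [[1, -1, 1, 1], [1, 1, -1, 1], [-1, 1, 1, 1], [1, 1, 1, -1]]
    print(column_shifts(A))
```

Each pass solves one linear system by elimination, which matches the $O(i^3)$ cost per step quoted in the proof.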
The minimization of discrepancy is completed with the following "greedy technique" that yields the final row shifts. Let $s_1, \ldots, s_n$ be nonnegative integers and let $K$ be a positive integer such that $s_1 \le K$, and for $2 \le i \le n$,

$$s_i \le s_1 + \cdots + s_{i-1} + K.$$

Then there exist $y_1, \ldots, y_n = \pm 1$ so that

$$|y_1 s_1 + \cdots + y_n s_n| \le K.$$

The $y_i$ are found by reverse induction. Set $y_n = +1$. And having found $y_n, y_{n-1}, \ldots, y_{i+1}$ we choose $y_i = \pm 1$ so as to minimize the absolute value of the partial sum $\sum_{t \ge i} y_t s_t$. Our condition on the $s_i$ assures that we never get "stuck" and that the final sum has the desired property!
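The reverse induction translates directly into a short loop. The sketch below is illustrative; the function name is hypothetical and ties are broken in favor of $+1$, which also yields $y_n = +1$.

```python
def greedy_signs(s):
    """Choose y_n, y_{n-1}, ..., y_1 in turn, each minimizing |sum_{t >= i} y_t s_t|."""
    y = [1] * len(s)
    partial = 0
    for i in range(len(s) - 1, -1, -1):
        # At the last index partial is 0, so the tie is resolved to +1.
        y[i] = 1 if abs(partial + s[i]) <= abs(partial - s[i]) else -1
        partial += y[i] * s[i]
    return y


if __name__ == "__main__":
    s = [1, 2, 3, 7]  # satisfies s_1 <= K and s_i <= s_1 + ... + s_{i-1} + K with K = 1
    y = greedy_signs(s)
    print(y, sum(yi * si for yi, si in zip(y, s)))
```

In the example the condition holds with $K = 1$, and the signed sum returned has absolute value 1.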

Caution: Given an arbitrary configuration we apply the lemma so that $|r_i| < 2i$. We then apply row shifts so that all row sums are nonnegative. We set $K = 2$ (say, $n$ is even); however, we may not immediately apply the Greedy Technique because we may have too many $r_i = 0$. To set the "field ready" for the technique to work we need to reorder the rows and use a couple more column shifts. The details are interesting, though they tend to get cumbersome towards the end, and we stick to our policy of omitting them.

Spencer  [14]  discusses  other  games  such  as  attempting  to  maximize  the  discrepancy  in 
detail. 

4     Lovasz  Local  Lemma 

Noga  Alon  [1]  gives  a  detailed  account  of  the  applications  of  the  local  lemma.  In  this  section 
we  review  some  of  the  interesting  ones.  We  mentioned  that  the  techniques  of  the  previous 
section  and  the  local  lemma  yield  strongest  results  when  the  basic  probabilistic  method  fails. 
However,  the  local  lemma  (unlike  the  rounding  techniques)  yields  existential  solutions.  And 
thus  the  results  obtained  seem  typically  nonconstructive  in  nature. 

A hypergraph $H = (V, E)$ has property B if $V$ is two-colorable so that no edge in $E$ is monochromatic. The following is a simple corollary of the local lemma.

Theorem 4.1 Let $H$ be a hypergraph in which every edge has at least $k$ elements, and suppose that each edge of $H$ intersects at most $d$ other edges. If $e(d + 1) \le 2^{k-1}$ then $H$ has property B.
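The hypothesis of Theorem 4.1 is a one-line numerical test. The helper names below are hypothetical, and floating-point arithmetic is assumed to be accurate enough for moderate $k$.

```python
import math


def property_b_guaranteed(k: int, d: int) -> bool:
    """True if e(d + 1) <= 2^(k - 1), the hypothesis of Theorem 4.1."""
    return math.e * (d + 1) <= 2 ** (k - 1)


def max_overlap(k: int) -> int:
    """Largest d for which the hypothesis holds, or -1 if there is none."""
    return math.floor(2 ** (k - 1) / math.e) - 1


if __name__ == "__main__":
    for k in (5, 10, 20):
        print(k, max_overlap(k))
```

The test is sufficient only: a hypergraph failing it may still have property B.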

We now describe a recent geometric result due to Mani and Pach [10]. A family $\mathcal{F}$ of open unit balls in the 3-dimensional Euclidean space $\mathbb{R}^3$ is called a $k$-fold covering of $\mathbb{R}^3$ if any point $x \in \mathbb{R}^3$ belongs to at least $k$ balls. A $k$-fold covering $\mathcal{F}$ is called decomposable if there is a partition of $\mathcal{F}$ into two disjoint families $\mathcal{F}_1$ and $\mathcal{F}_2$, each of which is a covering of $\mathbb{R}^3$. Mani-Pach's result implies that any $k$-fold covering of $\mathbb{R}^3$ in which no point is covered by more than $c\, 2^{k/3}$ balls is decomposable. It is indeed mysterious that it is more difficult to decompose coverings that cover some of the points too often, than to decompose coverings that cover every point about the same number of times. It is less intriguing when one finds out that the proof makes use of Lovasz local lemma. If coverings of points are considered as events in the probability space, then covering the same point corresponds to dependency between different events. And the use of the local lemma requires the dependencies be rare. Precise formulation is as follows.

Theorem 4.2 Let $\mathcal{F} = \{B_i\}_{i \in I}$ be a $k$-fold covering of $\mathbb{R}^3$ by open unit balls. Suppose, further, that no point of $\mathbb{R}^3$ is contained in more than $t$ members of $\mathcal{F}$. If

$$e\, t^3 2^{18} / 2^{k-1} \le 1$$

then $\mathcal{F}$ is decomposable.

Proof. Consider the connected components $\{C_j\}_{j \in J}$ of the set obtained from $\mathbb{R}^3$ by removing the boundaries of the balls $B_i$ in $\mathcal{F}$. Let $H$ be the (infinite) hypergraph defined as follows. The balls in $\mathcal{F}$ form the vertex set; there is a hyperedge $E_j$ consisting of all the vertices (balls) containing the connected component $C_j$. Since $\mathcal{F}$ is a $k$-fold covering, we know that each edge $E_j$ contains at least $k$ vertices. Furthermore, if we know that each edge intersects less than $t^3 2^{18}$ other edges, then by Theorem 4.1 $H$ has property B. We simply let $\mathcal{F}_1$ be the set of all blue balls and $\mathcal{F}_2$ be the set of all red ones. Clearly each $\mathcal{F}_i$ is a covering of $\mathbb{R}^3$ and we are done modulo the claim that the edge-edge dependency is $< t^3 2^{18}$.

To prove this claim, fix an edge $E_i$ corresponding to the connected component $C_i$. That an arbitrary edge $E_j$ intersects $E_i$ means there is a ball $B$ containing both $C_i$ and $C_j$, so any ball that contains $C_j$ intersects $B$. It follows that all the unit balls that contain or touch a $C_j$ for some $j$ that satisfies $E_j \cap E_i \ne \emptyset$ are contained in a ball of radius 4. As no point of this ball is covered more than $t$ times, the total number of these unit balls is at most $t \cdot 4\pi 4^3 / 4\pi = t \cdot 2^6$. It can be checked that $m$ balls in $\mathbb{R}^3$ cut $\mathbb{R}^3$ into less than $m^3$ connected components. And since each of the above $C_j$ is such a component, we have, as claimed,

$$|\{j : E_j \cap E_i \ne \emptyset\}| \le (t \cdot 2^6)^3 = t^3 2^{18}.$$

Important technicality: Note that the above hypergraph $H$ is infinite whereas Theorem 4.1 holds only for a finite $H$. Thus a standard compactness argument is essential to complete the proof.


The general case of the local lemma is a statement about the situation in which the events $A_i$ have different probabilities.

Lovasz Local Lemma (General case).   Let $A_1, \ldots, A_n$ be events with dependency graph $G$. Assume there exist $x_1, \ldots, x_n \in [0, 1)$ with

$$\Pr[A_i] \le x_i \prod_{\{i,j\} \in G} (1 - x_j)$$

for all $i$. Then

$$\Pr\Big[\bigwedge_i \bar{A}_i\Big] \ge \prod_{i=1}^{n} (1 - x_i) > 0.$$

The  following  is  an  example  where  we  have  events  with  two  different  probabilities  and 
the  general  case  comes  to  the  rescue. 

Theorem 4.3 (Noga Alon) Let $H = (V, E)$ be a graph with maximum degree $d$, and let $V = V_1 \cup V_2 \cup \cdots \cup V_r$ be an arbitrary partition of $V$ into $r$ disjoint sets. There is a small constant $c$ ($= 25$) so that if each set $V_i$ has at least $cd$ vertices, then there is an independent set of vertices $W \subseteq V$ that contains at least one vertex from each $V_i$.

Proof. Let us assume, for convenience, that each $V_i$ is of cardinality precisely $cd$. We
pick each vertex of $H$ randomly and independently, with probability $p = 1/(cd)$, to form the
set $W$. We formulate two sets of bad events. For each $i$, $1 \le i \le r$, let $A_i$ be the event
that $W \cap V_i = \emptyset$; and for each edge $f$, let $B_f$ be the event that $W$ contains both
ends of $f$. Clearly,

$$\Pr(A_i) = (1 - p)^{cd} \quad \text{and} \quad \Pr(B_f) = p^2.$$

Moreover, there is a dependency digraph for these events in which each $A_i$ node is adjacent
to at most $cd^2$ $B_f$ nodes (and to no $A_j$ nodes), and each $B_f$ node is adjacent to at most
2 $A_i$ nodes and at most $2(d - 1)$ $B_{f'}$ nodes. It follows from the general case of the local
lemma that if we can find two numbers $x$ and $y$, $0 \le x, y < 1$, so that

$$(1 - p)^{cd} \le x (1 - y)^{cd^2} \qquad (2)$$

$$p^2 \le y (1 - y)^{2d - 2} (1 - x)^2 \qquad (3)$$

then $\Pr(\bigwedge_i \bar{A_i} \wedge \bigwedge_f \bar{B_f}) > 0$. To complete the proof, one has
to check that $c = 25$, $x = 1/2$, $y = 1/(100d^2)$ satisfy (2) and (3).
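The final check can be spot-tested numerically. The sketch below evaluates inequalities (2) and (3) with $p = 1/(cd)$, $c = 25$, $x = 1/2$ and $y = 1/(100d^2)$ for a few sample degrees; the function name and the choice of sample values of $d$ are assumptions made for illustration, and a numerical check at finitely many $d$ is of course not a proof.

```python
def alon_inequalities_hold(c, d, x, y):
    """Evaluate inequalities (2) and (3) with p = 1/(c*d) in floating point."""
    p = 1.0 / (c * d)
    # (2): (1 - p)^(cd) <= x * (1 - y)^(c d^2)
    ineq2 = (1 - p) ** (c * d) <= x * (1 - y) ** (c * d * d)
    # (3): p^2 <= y * (1 - y)^(2d - 2) * (1 - x)^2
    ineq3 = p ** 2 <= y * (1 - y) ** (2 * d - 2) * (1 - x) ** 2
    return ineq2 and ineq3

# Sample degrees (hypothetical choice) with the constants quoted in the proof.
results = {
    d: alon_inequalities_hold(c=25, d=d, x=0.5, y=1.0 / (100 * d * d))
    for d in (1, 2, 5, 10, 100, 1000)
}
print(results)
```

Both inequalities have visible slack: the left side of (2) stays below $e^{-1} \approx 0.368$ while the right side is close to $\frac{1}{2} e^{-1/4} \approx 0.389$, and in (3) the ratio of the two sides is roughly $400/625$.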

We present here the following improvement.

Claim: $|V_i| \ge 10d$ suffices in Theorem 4.3.

Proof. Let us assume that $|V_i| = cd$, and try to minimize $c$. We choose vertices out
of each $V_i$ with probability $p = \lambda/(cd)$, randomly and independently. Preliminary
calculation suggests we choose $y$ to be of the form $\mu/(cd^2)$. And it turns out that the
minimum value for $c$ is attained when $\lambda = 2\mu$!

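Using $(1 - p)^{cd} \approx e^{-\lambda}$ and $(1 - y)^{cd^2} \approx e^{-\mu}$, inequalities (2)
and (3) can now be written as

$$e^{-\lambda} \le x e^{-\mu}$$

$$\frac{\lambda^2}{(cd)^2} \le \frac{\mu}{cd^2} (1 - y)^{2d - 2} (1 - x)^2$$

This gives $x = e^{\mu - \lambda}$, since the second inequality requires $x$ to be as low as
possible. Ignoring $(1 - y)^{2d - 2}$, which is close to $e^{-2\mu/cd}$, we get

$$c_{\min} = \min \frac{\lambda^2}{\mu (1 - e^{\mu - \lambda})^2} \quad \text{where } \lambda > \mu > 0.$$

More calculation yields

$$\lambda = 2\mu, \qquad 2\mu + 1 = e^{\mu},$$

giving $\mu = 1.2564312$ and $c_{\min} = 9.82163$.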
Thus, if we start with $|V_i| \ge 10d$, we can choose

$$\mu = 5/4 \ (= 1.25), \quad \lambda = 5/2, \quad p = 1/(4d), \quad x = e^{-5/4}, \quad \text{and} \quad y = 1/(8d^2)$$

to satisfy the inequalities (2) and (3).
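The optimization above can be reproduced numerically. The sketch below solves $e^{\mu} = 2\mu + 1$ by bisection, evaluates $\lambda^2 / (\mu (1 - e^{\mu - \lambda})^2)$ at $\lambda = 2\mu$, and then evaluates inequalities (2) and (3) for the rounded choice $c = 10$, $\mu = 5/4$, $\lambda = 5/2$. The function names, the bisection bracket $[1, 2]$ and the sample degrees are assumptions made for illustration. The inequalities are evaluated only at moderately large $d$, because the derivation ignores the factor $(1 - y)^{2d - 2}$, which matters more when $d$ is small.

```python
from math import exp

def solve_mu(lo=1.0, hi=2.0, iters=60):
    """Bisection for the positive root of exp(mu) = 2*mu + 1 inside [lo, hi]."""
    def g(m):
        return exp(m) - 2 * m - 1
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # sign change in [lo, mid]
        else:
            lo = mid  # sign change in [mid, hi]
    return (lo + hi) / 2

def c_of(lam, mu):
    """The quantity lam^2 / (mu * (1 - exp(mu - lam))^2) that is minimized."""
    return lam ** 2 / (mu * (1 - exp(mu - lam)) ** 2)

def inequalities_hold(c, d, lam, mu):
    """Evaluate (2) and (3) with p = lam/(cd), y = mu/(c d^2), x = exp(mu - lam)."""
    p = lam / (c * d)
    y = mu / (c * d * d)
    x = exp(mu - lam)
    ineq2 = (1 - p) ** (c * d) <= x * (1 - y) ** (c * d * d)
    ineq3 = p ** 2 <= y * (1 - y) ** (2 * d - 2) * (1 - x) ** 2
    return ineq2 and ineq3

mu = solve_mu()
c_min = c_of(2 * mu, mu)
print(mu, c_min)

# Rounded parameters from the claim, at sample large degrees (hypothetical choice).
print([inequalities_hold(10, d, 2.5, 1.25) for d in (100, 1000)])
```

The computed minimum is roughly 9.8, which is why the next integer, $c = 10$, is the natural target for the claim.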

Recall that $R(k,t)$ is the minimal $n$ so that if the edges of $K_n$ are two-colored Red and
Blue then either there is a Red $K_k$ or a Blue $K_t$. The local lemma provides a simple way of
improving the lower bounds for $R(k,t)$. As an example, it is not difficult to show that

if $e \left( \binom{k}{2} \binom{n}{k-2} + 1 \right) 2^{1 - \binom{k}{2}} < 1$ then $R(k,k) > n$.

Further calculation shows that $R(k,k) > \frac{\sqrt{2}}{e} (1 + o(1))\, k\, 2^{k/2}$, yielding
only a small constant factor improvement. The dependencies being rare, things are brighter when
we apply the lemma to the off-diagonal Ramsey numbers $R(k,t)$, with one of the parameters small.
We can prove, using the lemma, that

$$R(3,t) > (\ldots - o(1))\, t^2 / \log^2 t$$

and  that 

There is no known better bound for $R(4,t)$ without using the local lemma. And this still
leaves a gap, since the upper bound is $R(4,t) < t^{3 + o(1)}$.
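For a concrete feel of the diagonal bound, the condition $e \left( \binom{k}{2} \binom{n}{k-2} + 1 \right) 2^{1 - \binom{k}{2}} < 1$ can be evaluated directly. The sketch below finds, for a given $k$, the largest $n$ that satisfies it, so that $R(k,k) > n$. The function names are illustrative assumptions; the test is done on logarithms only to avoid very large or very small floating-point values.

```python
from math import comb, log

def lll_diagonal_condition(n, k):
    """True if e * (C(k,2) * C(n,k-2) + 1) * 2^(1 - C(k,2)) < 1, tested in logs."""
    pairs = comb(k, 2)
    return 1 + log(pairs * comb(n, k - 2) + 1) + (1 - pairs) * log(2) < 0

def largest_n(k):
    """Largest n >= k satisfying the condition, or None if even n = k fails."""
    if not lll_diagonal_condition(k, k):
        return None
    n = k
    # C(n, k-2) grows with n, so the condition fails for good once it fails.
    while lll_diagonal_condition(n + 1, k):
        n += 1
    return n

for k in (4, 5, 10, 12):
    print(k, largest_n(k))
```

For very small $k$ the condition is not satisfiable at all (for $k = 4$ and $n = 4$ the left side is $37e/32 > 1$), which matches the remark that the gain on the diagonal is only a constant factor and shows up for larger $k$.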

We  conclude  this  section  with  yet  another  application  of  the  local  lemma.  (Strictly 
speaking,  this  is  a  special  case  of  Theorem  4.1.)  Consider  n  points  and  n  sets  made  out  of 
these  n  points.  Suppose,  further,  that  each  set  has  exactly  10  points  and  that  each  point 
is  in  exactly  10  sets;  i.e.  the  degree  of  each  set  and  the  degree  of  each  point  equal  10.  The 
local  lemma  guarantees  the  existence  of  a  two-coloring  of  the  n  points  such  that  no  set  is 
monochromatic!  While  there  exists  no  known  efficient  deterministic  algorithm  for  finding 
such a coloring, we report here the success of a randomized experiment. The following
procedure was suggested by Noga Alon and Joel Spencer.

Randomized Recoloring.
Step 1. Color the $n$ points Red or Blue randomly and uniformly with equal probability.
Step 2. Set $M$ to be the collection of monochromatic sets at this stage. If $M$ is empty
terminate, else continue.
Step 3. Randomly recolor all the (underlying) points in the union of monochromatic sets in
$M$. And go to Step 2.
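The three steps translate almost line by line into code. The following is a minimal sketch, not the program used for the experiments reported here: the circulant test instance, the function names, the fixed seed and the `max_rounds` cutoff are all assumptions introduced for illustration. The cutoff is there because the procedure is not known to terminate.

```python
import random

def circulant_instance(n, offsets):
    """Hypothetical test instance: set i is {(i + o) mod n : o in offsets}.

    With k distinct offsets, every set has exactly k points and every
    point lies in exactly k sets.
    """
    return [[(i + o) % n for o in offsets] for i in range(n)]

def monochromatic_sets(sets, color):
    """Return the sets whose points all carry the same color."""
    return [s for s in sets if len({color[p] for p in s}) == 1]

def randomized_recoloring(n, sets, rng, max_rounds=1000):
    """Run Steps 1-3; return (coloring, history) with history[i] = m_i.

    Returns (None, history) if max_rounds is exhausted.
    """
    # Step 1: color every point Red (0) or Blue (1) uniformly at random.
    color = [rng.randrange(2) for _ in range(n)]
    history = []
    for _ in range(max_rounds):
        # Step 2: collect the monochromatic sets; stop if there are none.
        mono = monochromatic_sets(sets, color)
        history.append(len(mono))
        if not mono:
            return color, history
        # Step 3: recolor every point in the union of the monochromatic sets.
        for p in {q for s in mono for q in s}:
            color[p] = rng.randrange(2)
    return None, history

rng = random.Random(0)  # fixed seed so the run is repeatable
n, degree = 2000, 9
sets = circulant_instance(n, rng.sample(range(n), degree))
coloring, history = randomized_recoloring(n, sets, rng)
print(history)
```

The returned `history` plays the role of the sequence $m_0, m_1, \ldots$ of monochromatic-set counts, so it can be compared qualitatively with the behaviour described for the experiments.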

Theoretically  speaking,  the  procedure  is  not  even  known  to  terminate,  and  thus  is  not 
known  to  be  an  algorithm.  In  practice,  however,  we  found  that  the  recoloring  technique 
works  quite  well.  In  fact,  there  seems  to  exist  a  two-coloring  even  when  the  degree  =  4. 
The  table  below  contains  some  of  the  results  of  our  experiment  on  the  computer. 


degree = 9

   i    N = 175000   N = 150000   N = 150000   N = 100000
   0         658          604          598          397
   1          80           88           94           50
   2          16            8            8            8
   3           4            2            1            1
   4           1            0            0            0
   5           0            -            -            -

degree = 4

   i    N = 200000   N = 175000   N = 125000
   0       24933        21951        15606
   1       16027        14061        10012
   2       11038         9649         6962
   3        7947         6958         4929
   4        5861         5042         3447
   5        4258         3702         2583
   6        3098         2548         1957
   7        2260         1883         1450
   8        1618         1373         1029
   9        1135         1046          757
  10         798          808          548
  11         623          614          373
  12         487          454          292
  13         337          340          186
  14         235          233          129
  15         166          195          101
  16         124          153           86
  17          92          110           59
  18          66           81           44
  19          42           57           32
  20          26           35           21
  21          20           21           13
  22          15           23            8
  23          15           17            4
  24           6           16            3
  25           3            9            1
  26           1            8            0
  27           1            5            -
  28           0            2            -
  29           -            1            -
  30           -            0            -

Each entry is $m_i$ = number of monochromatic sets after the $i$th recoloring step; each
column is one run, headed by its value of N.


5      Future Work

Our primary interest in this study is two-fold: first, to provide either new
techniques or new ways of using the existing techniques that turn randomized solutions into
deterministic versions; second, to experiment with randomization in an attempt to turn
existential proofs into effective methods. More specifically, we mention the following hard
problems and HARD PROBLEMS!

hard  problems 

The  flow  shop  problem  and  many  other  scheduling  problems  were  shown  to  be  NP-hard 
by reduction from the 3-PARTITION problem. It is natural to wonder if Sevast'yanov's
result is useful in a more general setting; i.e. in providing an efficient approximation algorithm
for the other problems.

The method of conditional probabilities and other methods describe sequential rounding;
is there an efficient way of deterministically rounding in parallel?

It is interesting, but seems hard, to find a way around the local lemma. Thus the
problem is to provide constructive proofs for any of the results guaranteed by the local lemma.
As demonstrated in the previous section, there seems to be some hope in randomization.

HARD  PROBLEMS 

Bárány's technique (Theorem 3.5) is useful in finding a good permutation schedule. The
open problem is to find a similar translation between the Steinitz lemma and the flow shop
problem, yielding a good (hopefully better) general schedule.

Can the Lovász local lemma be implemented by an efficient algorithm? In Spencer's
words, can we find a "needle in a haystack" in polynomial time?

What is the value of $R(k,4)$?